Approximate Reduction from AUC Maximization to 1-Norm Soft Margin Optimization
Authors
Abstract
Finding linear classifiers that maximize AUC scores is important in ranking research. The task is naturally formulated as a 1-norm hard/soft margin optimization problem over the pn pairs formed from p positive and n negative instances. However, solving this optimization problem directly is impractical, since the problem size (pn) is quadratically larger than the given sample size (p + n). In this paper, we give (approximate) reductions from these problems to hard/soft margin optimization problems of linear size. First, for the hard margin case, we show that the problem reduces to a hard margin optimization problem over the p + n instances in which the bias (constant) term is also optimized. Then, for the soft margin case, we show that the problem approximately reduces to a soft margin optimization problem over the p + n instances, such that the resulting linear classifier is guaranteed to achieve a certain margin over pairs.
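To make the size gap concrete, the pairwise problem the abstract starts from can be written as a 1-norm soft margin linear program over all pn difference vectors x_i^+ - x_j^-: maximize rho - (1/(nu*pn)) * sum(xi_ij) subject to w . (x_i^+ - x_j^-) >= rho - xi_ij, xi >= 0, and ||w||_1 = 1. The sketch below sets up this naive quadratic-size LP with scipy.optimize.linprog. It illustrates only the starting formulation, not the paper's reduction; the soft-margin parameter nu and the variable split w = w_plus - w_minus are a standard LPBoost-style parameterization and a standard LP device, respectively, not details taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def pairwise_auc_lp(X_pos, X_neg, nu=0.1):
    """Sketch of the quadratic-size 1-norm soft margin LP over all
    p*n positive-negative pairs (the problem the paper reduces).

    Variables: w split as w_plus - w_minus (to keep the 1-norm
    constraint linear), the margin rho, and one slack xi per pair.
    Objective: maximize rho - (1 / (nu * p * n)) * sum(xi).
    """
    p, d = X_pos.shape
    n, _ = X_neg.shape
    # All p*n pairwise differences x_i^+ - x_j^-: the quadratic blow-up.
    Z = (X_pos[:, None, :] - X_neg[None, :, :]).reshape(p * n, d)
    m = p * n
    # Variable layout: [w_plus (d), w_minus (d), rho (1), xi (m)].
    # linprog minimizes, so negate rho and penalize the slacks.
    c = np.concatenate([np.zeros(2 * d), [-1.0], np.full(m, 1.0 / (nu * m))])
    # Margin constraints  w . z_ij >= rho - xi_ij  rewritten as
    # -Z w_plus + Z w_minus + rho - xi <= 0.
    A_ub = np.hstack([-Z, Z, np.ones((m, 1)), -np.eye(m)])
    b_ub = np.zeros(m)
    # 1-norm normalization: sum(w_plus) + sum(w_minus) = 1.
    A_eq = np.concatenate([np.ones(2 * d), [0.0], np.zeros(m)])[None, :]
    b_eq = np.array([1.0])
    bounds = [(0, None)] * (2 * d) + [(None, None)] + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    w = res.x[:d] - res.x[d:2 * d]
    rho = res.x[2 * d]
    return w, rho
```

Even a modest sample of p = n = 1,000 already yields a million margin constraints and a million slack variables here, which is exactly the quadratic blow-up that motivates reducing to a hard/soft margin problem over the p + n original instances.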
Similar papers
Optimizing Area Under Roc Curve with SVMs
For many years now, there has been growing interest in the ROC curve for characterizing machine learning performance. This is particularly due to the fact that in real-world problems misclassification costs are often unknown, so the ROC curve and related metrics such as the Area Under the ROC Curve (AUC) can be more meaningful performance measures. In this paper, we propose a quadratic programming bas...
Support Vector Machines and Area Under ROC curve
For many years now, there has been growing interest in the ROC curve for characterizing machine learning performance. This is particularly due to the fact that in real-world problems misclassification costs are often unknown, so the ROC curve and related metrics such as the Area Under the ROC Curve (AUC) can be more meaningful performance measures. In this paper, we propose an SVM-based algorithm for ...
Explicit Max Margin Input Feature Selection for Nonlinear SVM using Second Order Methods
Incorporating feature selection in nonlinear SVMs leads to a large and challenging nonconvex minimization problem, which can be prone to suboptimal solutions. We use a second order optimization method that utilizes eigenvalue information and is less likely to get stuck at suboptimal solutions. We devise an alternating optimization approach to tackle the problem efficiently, breaking it down int...
A Duality View of Boosting Algorithms
We study boosting algorithms from a new perspective. We show that the Lagrange dual problems of AdaBoost, LogitBoost and soft-margin LPBoost with generalized hinge loss are all entropy maximization problems. By looking at the dual problems of these boosting algorithms, we show that the success of boosting algorithms can be understood in terms of maintaining a better margin distribution by maxim...
A Fast SVM-based Feature Elimination Utilizing Data Radius, Hard-Margin, Soft-Margin
Margin maximization in the hard-margin sense, proposed as a feature elimination criterion by the MFE-LO method, is combined here with utilization of the data radius, aiming to further lower generalization error: several published bounds and bound-related formulations for lowering misclassification risk (or error) involve the radius, e.g. the product of the squared radius and the squared norm of the weight vector...